Improved estimation of supervision in unsupervised speaker adaptation

نویسندگان

  • Shigeru Homma
  • Kiyoaki Aikawa
  • Shigeki Sagayama
چکیده

Unsupervised speaker adaptation plays an important role in \batch dictation," the aim of which is to automatically transcribe large amounts of recorded dictation using speech recognition. In the case of unsupervised speaker adaptation which uses recognition results of target speech as the means of supervision, erroneous recognition results degrade the quality of the adapted acoustic models. This paper presents a new supervision selection method. By using this method, correction of the rst candidate is judged based on the likelihood ratio of the rst and the second candidates. This method eliminates erroneous recognition results and corresponding speech data from the adaptive training data. We implemented this method in the iterative unsupervised speaker adaptation procedure. It is shown that the recognition errors are drastically reduced by 50% in a practical application of batch-style speech-to-text conversion of recorded dictation of Japanese medical diagnoses compared with speaker-independent recognition.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Speaker Adaptation in Continuous Speech Recognition Using MLLR-Based MAP Estimation

A variety of methods are used for speaker adaptation in speech recognition. In some techniques, such as MAP estimation, only the models with available training data are updated. Hence, large amounts of training data are required in order to have significant recognition improvements. In some others, such as MLLR, where several general transformations are applied to model clusters, the results ar...

متن کامل

Speaker Adaptation in Continuous Speech Recognition Using MLLR-Based MAP Estimation

A variety of methods are used for speaker adaptation in speech recognition. In some techniques, such as MAP estimation, only the models with available training data are updated. Hence, large amounts of training data are required in order to have significant recognition improvements. In some others, such as MLLR, where several general transformations are applied to model clusters, the results ar...

متن کامل

On-Line Unsupervised Adaptation in Speaker Verification: Confidence-Based Updates and Improved Parameter Estimation

This paper presents the second part of a new approach to on-line unsupervised adaptation in speaker verification. The new approach extends previous work in the literature by (1) improving performance on the enrollment handset-type when adapting on a different handset-type (e.g., improving performance on cellular when adapting on a landline office phone), (2) accomplishing this cross channel imp...

متن کامل

Iterative unsupervised speaker adaptation for batch dictation

This paper describes an automatic batch-style dictation paradigm in which the entire dictated speech is fully utilized for speaker adaptation and is recognized using the speaker adaptation results. The key point is that the same speech data is used both for recognition as the target and for speaker adaptation. Two steps, speech recognition and speaker adaptation which uses recognition results a...

متن کامل

Unsupervised speaker adaptation using high confidence portion recognition results by multiple recognition systems

This paper describes an accurate unsupervised speaker adaptation method for lecture speech recognition using multiple LVCSRs. In an unsupervised speaker adaptation framework, the improvement of recognition performance by adapting acoustic models greatly depends on the accuracy of labels such as phonemes and syllables. Therefore, extraction of the adaptation data guided by the confidence measure...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 1997